
Fast and accurate imputation of summary statistics enhances evidence of functional enrichment


Abstract

Imputation using external reference panels is a widely used approach for increasing power in GWAS and meta-analysis. Existing HMM-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants (increasing to 87% (60%) when summary LD information is available from target samples) versus 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and is computationally very fast. As an empirical demonstration, we apply our method to 7 case-control phenotypes from the WTCCC data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of $\chi^2$ association statistics) compared to HMM-based imputation from individual-level genotypes at the 227 (176) published SNPs in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of 4 lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic vs. non-genic loci for these traits, as compared to an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses.
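The abstract does not spell out the estimator, so the following is only a minimal sketch of what Gaussian imputation from summary statistics could look like. It assumes the standard multivariate-normal conditional mean: z-scores at untyped SNPs are predicted from z-scores at typed SNPs using LD (correlation) matrices estimated from a reference panel. The function name `impute_z`, its parameters, and the `ridge` value are hypothetical; the ridge term added to the diagonal is one common way to account for a finite reference panel and is not necessarily the authors' exact adjustment.

```python
import numpy as np


def impute_z(z_typed, ld_tt, ld_ut, ridge=0.1):
    """Sketch of conditional-mean imputation of association z-scores.

    z_typed : (t,) z-scores at typed SNPs
    ld_tt   : (t, t) reference-panel correlation among typed SNPs
    ld_ut   : (u, t) reference-panel correlation, untyped vs. typed SNPs
    ridge   : hypothetical regularization constant (an assumption)
    """
    z_typed = np.asarray(z_typed, dtype=float)
    ld_tt = np.asarray(ld_tt, dtype=float)
    ld_ut = np.asarray(ld_ut, dtype=float)

    # Regularize the typed-SNP LD matrix to stabilize the solve.
    ld_tt_reg = ld_tt + ridge * np.eye(ld_tt.shape[0])

    # Weights W = ld_ut @ inv(ld_tt_reg), computed via a linear solve
    # (valid because ld_tt_reg is symmetric).
    weights = np.linalg.solve(ld_tt_reg, ld_ut.T).T

    # Conditional mean of the untyped z-scores given the typed ones.
    z_imputed = weights @ z_typed

    # Per-SNP quadratic form W_i @ ld_tt_reg @ W_i^T; values well below 1
    # indicate SNPs that the typed set predicts poorly.
    var_imputed = np.einsum("ij,jk,ik->i", weights, ld_tt_reg, weights)
    return z_imputed, var_imputed


# Toy usage: one untyped SNP in perfect LD with the first typed SNP.
ld_tt = np.array([[1.0, 0.5], [0.5, 1.0]])
ld_ut = np.array([[1.0, 0.5]])
z_hat, var_hat = impute_z([3.0, 1.0], ld_tt, ld_ut, ridge=0.0)
print(z_hat, var_hat)
```

With `ridge=0.0` and an untyped SNP in perfect LD with a typed one, the imputed z-score equals the typed z-score; a positive `ridge` shrinks the imputed values toward zero.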
